Cohesion and explicitation in an English-German translation corpus

Authors

  • Silvia Hansen-Schirra
  • Stella Neumann
  • Erich Steiner
Abstract

In translation studies, cohesive features as indicators for explicitation have been analysed either in an example-based way (Blum-Kulka 1986) or as concordances in monolingually comparable corpora of raw text (cf. several contributions in Laviosa (ed.) 1998, Olohan & Baker 2000). In spite of the insight gained from this line of research, we argue that where explicitation is investigated without taking into account the source texts, the interpretation of results remains restricted and problematic. Work on translations against a more linguistic background has addressed some of these restrictions and problems (cf. relevant work as in Johansson & Oksefjell (eds.) 1998, Fabricius-Hansen 1999); the focus of these research interests and methodologies is, however, different from, and partly complementary to, ours with respect to corpus architecture, querying techniques and underlying linguistic modelling (for which cf. Hansen 2003, Neumann 2003, Steiner 2001, 2005a,b,c, Teich 2003). The basic assumption for the analysis of explicitation in the present paper is that the element explicitated in the target text has to be present implicitly in a linguistically traceable way in the source text, and vice versa for implicitated elements. Explicitation is thus defined as a relationship and a process between instantiated and aligned pieces of translated texts. Furthermore, we stratify the notion of explicitation according to the linguistic levels of lexicogrammar (not in focus in this paper) and cohesion. As this stratification is still too abstract to be directly quantifiable on linguistic data in an electronic corpus, a series of further micro-level operationalisations is undertaken which are meant to bring the relevant phenomena down to an empirically accessible level. Our investigation of explicitation and implicitation of cohesion markers in translations is based on a cross-linguistic corpus containing statistically meaningful and representative samples (cf. Biber 1993) of German and English registerially parallel texts from 8 registers annotated with parts of speech, morphology, phrase structure and grammatical functions. In addition to these two sub-corpora, two further sub-corpora have been compiled consisting of translations of the samples from the first two sub-corpora into the respective other language, yielding 4 sub-corpora. The overall corpus comprises 1 million words (approx. 250,000 for each of the four sub-corpora) plus 68,000 words in register-neutral (cross-register) reference corpora in both languages. A characteristic feature of our corpus is the alignment of source and target texts on different linguistically motivated layers: we not only align sentences (which is state of the art in Translation Memories; e.g. Johansson et al. 1996) and words (which is state of the art in Machine Translation; cf. Och & Ney 2003) but also clauses. One of the methodological principles for the compilation of the resource is the distinction between lexico-grammatical/cohesive annotation of source and target language texts (including

Similar resources

Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation

In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data – sentence pairs, in which discourse structures in target or source texts have no alignment in the corresponding parallel sentences. The discourse-related structures are designed in form of linguistic patterns based on the information delivered by automatic part-of-speech and dependency an...

Strategies Used in the Translation of Interlingual Subtitling

This study was an attempt to identify the interlingual strategies employed to translate English subtitles into Persian and to determine their frequency, as well. Contrary to many countries, subtitling is a new field in Iran. The study, a corpus-based, comparative, descriptive, non-judgmental analysis of an English-Persian parallel corpus, comprised English audio scripts of five movies of differ...

Translation Strategies in English to Persian Translation of Children's Literature based on Klingberg's Model

This research sought to identify the translation strategies adopted by the translator in Persian translation of 'whatever after, Fairest of all' written by 'Sarah Mlynowski' based on Klingberg's model (1986). To achieve the objectives of the study, a qualitative content analysis design was selected for it. The corpus of the study consisted of 60 pages of the novel 'whatever after, Fairest of al...

Towards a Literary Machine Translation: The Role of Referential Cohesion

What is the role of textual features above the sentence level in advancing the machine translation of literature? This paper examines how referential cohesion is expressed in literary and non-literary texts and how this cohesion affects translation. We first show in a corpus study on English that literary texts use more dense reference chains to express greater referential cohesion than news. W...

Multi-dimensional Annotation and Alignment in an English-German Translation Corpus

This paper presents the compilation of the CroCo Corpus, an English-German translation corpus. Corpus design, annotation and alignment are described in detail. In order to guarantee the searchability and exchangeability of the corpus, XML stand-off mark-up is used as representation format for the multi-layer annotation. On this basis it is shown how the corpus can be queried using XQuery. Furth...

Publication date: 2006